Modeling Protein Secondary Structure by Products of Dependent Experts
نویسندگان
چکیده
A phenomenon as complex as protein folding requires a complex model to approximate it. This thesis presents a bottom-up approach for building complex probabilistic models of protein secondary structure by incorporating the multiple information sources which we call experts. Expert opinions are represented by probability distributions over the set of possible structures. Bayesian treatment of a group of experts results in a consensus opinion that combines the experts’ probability distributions using the operators of normalized product, quotient and exponentiation. The expression of this consensus opinion simplifies to a product of the expert opinions with two assumptions: (1) balanced training of experts, i.e., uniform prior probability over all structures, and (2) conditional independence between expert opinions, given the structure. This research also studies how Markov chains and hidden Markov models may be used to represent expert opinion. Closure properties are proven, and construction algorithms are given for product of hidden Markov models, and product, quotient and exponentiation of Markov chains. Algorithms for extracting single-structure predictions from these models are also given. Current product-of-experts approaches in machine learning are top-down modeling strategies that assume expert independence, and require simultaneous training of all experts. This research describes a bottom-up modeling strategy that can incorporate conditionally dependent experts, and assumes separately trained experts.
منابع مشابه
Physicochemical Position-Dependent Properties in the Protein Secondary Structures
Background: Establishing theories for designing arbitrary protein structures is complicated and depends on understanding the principles for protein folding, which is affected by applied features. Computer algorithms can reach high precision and stability in computationally designing enzymes and binders by applying informative features obtained from natural structures. Methods: In this study, a ...
متن کاملProtein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches
DNA sequence, containing all genetic traits is not a functional entity. Instead, it transfers to protein sequences by transcription and translation processes. This protein sequence takes on a 3D structure later, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...
متن کاملIn silico Analysis and Molecular Modeling of RNA Polymerase, Sigma S (RpoS) Protein in Pseudomonas aeruginosa PAO1
Background: Sigma factors are proteins that regulate transcription in bacteria. Sigma factors can be activated in response to different environmental conditions. The rpoS (RNA polymerase, sigma S) gene encodes sigma-38 (σ38, or RpoS), a 37.8 kDa protein in Pseudomonas aeruginosa (P. aeruginosa) strains. RpoS is a central regulator of the general stress response and operates in both retroa...
متن کاملIn Silico and in Vitroinvestigations on cry4aand cry11atoxins of Bacillus thuringiensis var Israelensis
In the present study we attempted to correlate the structure and function of the cry11a (72 kDa) and cry4a (135 kDa) proteins of Bacillus thuringiensis var israelensis. Homology modeling and secondary structure predictions were done to locate most probable regions for finding helices or strands in these proteins. The JPRED (JPRED consensus secondary structure prediction server) secondary struct...
متن کاملEffects of T208E activating mutation on MARK2 protein structure and dynamics: Modeling and simulation
Microtubule Affinity-Regulating Kinase 2 (MARK2) protein has a substantial role in regulation of vital cellular processes like induction of polarity, regulation of cell junctions, cytoskeleton structure and cell differentiation. The abnormal function of this protein has been associated with a number of pathological conditions like Alzheimer disease, autism, several carcinomas and development of...
متن کامل